Fix type issue introduced by #28 by HaileyStorm · Pull Request #39 · xjdr-alt/entropix

HaileyStorm · 2024-10-07T22:33:16Z

Commit #28 changed `apply_rotary_embed` to take a `dtype` parameter that defaults to float32, and forced the attention softmax to run in float32. Because `attention` does not pass `dtype` when it calls `apply_rotary_embed`, and the output matmul does not cast the float32 scores back to the dtype of `values`, this causes dtype mismatches when running in BF16.

This PR passes the existing `xq.dtype` as the `dtype` argument when calling `apply_rotary_embed` (alternatively, we could cast `keys` to float32 in `scores = torch.matmul(xq, keys)`), and casts `scores` to the dtype of `values` at the output matmul.
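For readers following along, here is a minimal sketch of the dtype flow described above. It is not the entropix code: `apply_rotary_embed_sketch` and `attention_sketch` are hypothetical, simplified stand-ins (no causal mask, no KV cache, no projection weights), and the tensor shapes and `freqs_cis` construction are assumptions chosen only so the example runs on its own.

```python
import math

import torch
import torch.nn.functional as F


def apply_rotary_embed_sketch(xq, xk, freqs_cis, dtype=torch.float32):
    """Hypothetical stand-in: rotate in float32, then cast the result to `dtype`."""

    def rotate(x):
        # Pair up the last dimension and treat each pair as one complex number.
        x_c = torch.view_as_complex(x.float().reshape(*x.shape[:-1], -1, 2))
        # freqs_cis is (seq, head_dim // 2); broadcast over batch and heads.
        x_rot = x_c * freqs_cis[None, :, None, :]
        return torch.view_as_real(x_rot).flatten(-2).to(dtype)

    return rotate(xq), rotate(xk)


def attention_sketch(xq, xk, xv, freqs_cis):
    """Simplified attention over (batch, seq, heads, head_dim) inputs."""
    head_dim = xq.shape[-1]
    # Fix 1: pass xq.dtype so the rotated tensors stay in the model dtype
    # instead of silently becoming float32 (the parameter's default).
    xq, xk = apply_rotary_embed_sketch(xq, xk, freqs_cis, dtype=xq.dtype)
    xq = xq.transpose(1, 2)        # (batch, heads, seq, head_dim)
    keys = xk.permute(0, 2, 3, 1)  # (batch, heads, head_dim, seq)
    values = xv.transpose(1, 2)    # (batch, heads, seq, head_dim)
    scores = torch.matmul(xq, keys) / math.sqrt(head_dim)
    # Softmax is still computed in float32 for numerical stability.
    scores = F.softmax(scores.float(), dim=-1)
    # Fix 2: cast the float32 scores back to the dtype of `values`
    # before the output matmul, which requires matching dtypes.
    return torch.matmul(scores.to(values.dtype), values)


# Assumed toy sizes, for illustration only.
batch, seq, heads, head_dim = 1, 4, 2, 8
angles = torch.outer(
    torch.arange(seq, dtype=torch.float32),
    torch.arange(head_dim // 2, dtype=torch.float32),
)
freqs_cis = torch.polar(torch.ones_like(angles), angles)
xq, xk, xv = (
    torch.randn(batch, seq, heads, head_dim).to(torch.bfloat16) for _ in range(3)
)
out = attention_sketch(xq, xk, xv, freqs_cis)
print(out.dtype, tuple(out.shape))
```

Dropping either of the two marked casts leaves one operand of a `torch.matmul` in float32 while the other is BF16, which is the mismatch this PR addresses.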


xjdr-alt · 2024-10-08T15:07:21Z

@Arrabonae could you take a look?

citizenhicks

tested this, works well. thanks for spotting this issue!

citizenhicks approved these changes Oct 8, 2024


stillmatic mentioned this pull request Oct 10, 2024

Fixing Dtype error: expected scalar type BFloat16 but found Float for Torch parts #59

Closed

HaileyStorm wants to merge 1 commit into xjdr-alt:main from HaileyStorm:patch-1


3 participants
